Mining California Vital Statistics Data

Authors

  • Du Zhang
  • Quoc Luan Ha
  • Meiliu Lu
Abstract

Vital statistics data offer fertile ground for data mining. In this paper, we discuss the results of a data mining project on the cause-of-death component of the vital statistics data for the state of California. A data mining tool called Cubist is used to build predictive models from two million cases spanning a nine-year period. The objective of our study is to discover knowledge that can be used to gain insight into various aspects of mortality in California, to predict health issues related to the causes of death, to aid decision- or policy-making processes, and to provide useful information services to customers. The results obtained in our study contain valuable new information.

Similar resources

An evaluation of California's inferred birth statistics for unmarried women.

The quality and reliability of birth statistics for unmarried women based on inferential data are evaluated in this methodological study...

Prediction of Cancer Count through Artificial Neural Networks Using Incidence and Mortality Cancer Statistics Dataset for Cancer Control Organizations

The ultimate goal of data mining is prediction, and predictive data mining is the most common type of data mining and one that has most direct business applications. This paper discusses how data mining will help in predicting cancer count for cancer statistics’ datasets. This paper discusses neural network and accurate prediction methods. Neural network is an adaptive system that changes its s...

Implementation Brief: Use of Commercial Record Linkage Software and Vital Statistics to Identify Patient Deaths

We evaluate the ability of a microcomputer program (Automatch) to link patient records in our hospital's database (N = 253,836) with mortality files from California (N = 1,312,779) and the U.S. Social Security Administration (N = 13,341,581). We linked 96.5% of 3,448 in-hospital deaths, 99.3% for patients with social security numbers. None of 14,073 patients known to be alive (because they were...

Casino Fraud Data Mining

Average revenue per casino hotel resort per year is $87,887,253 [1]. This much revenue attracts fraud and criminals leading to millions in lost revenue [2]. Recently, casinos have begun to track patrons. Their vital statistics and spending habits are all recorded in massive databases. This, however, has led to the challenge of extracting the pertinent information from these data sets and how to...

Significance tests for unsupervised pattern discovery in large continuous multivariate data sets

In this paper we consider the question of uncertainty of discovered patterns in data mining. In particular, we develop statistical tests for flagged patterns found in continuous data, where such patterns are perhaps more familiar to statisticians as local modes in the data. We indicate the significance of these patterns in terms of the probability that they have occurred by chance. We examine t...

Journal title:
  • IJCAT

Volume 27

Publication date: 2001